2025-01-27
The majority of eye tracking analyses rely on accurate detection of fixations and/or saccades using event detection algorithms (also known as fixation filters). These measures are often automatically exported from software provided by the eye tracker manufacturer. However, pre-processing and the choice of event detection algorithm can have a large impact on the results, especially if there is some level of noise in the data. This is typically the case in research about children or individuals with neurodevelopmental conditions such as autism or intellectual disability.
kollaR is a library for eye tracking analysis and visualization, specifically designed to make it easier to compare and evaluate event detection algorithms. It includes functions for pre-processing, event detection and visualizations for comparing the results of different data processing pipelines.
Once an event detection algorithm has been selected, kollaR can be used to create static and dynamic visualizations of gaze behavior for presentation and dissemination and conduct analyses based on areas of interest (AOIs).
This demonstration illustrates pre-processing, event detection and comparisons between different event detection algorithms.
These demonstrations use the kollaR, ggplot2 and dplyr libraries.
kollaR is available on CRAN. Install it by writing the following line of code in the R prompt:
install.packages("kollaR")
We will start working with example data from one participant. The data come from a preferential looking task. Participants view two images presented to the left and right on a monitor. The images are the paintings Angelus Novus and Senecio by Paul Klee (1879-1940). Data were recorded using a Tobii Pro Spectrum at 1200 Hz. The screen resolution is 1920 x 1080 pixels.
#Show the first parts of the data
head(sample.data.unprocessed)
## id timestamp x.raw y.raw
## 1 1 0 NaN NaN
## 2 1 1 NaN NaN
## 3 1 1 NaN NaN
## 4 1 2 NaN NaN
## 5 1 4 NaN NaN
## 6 1 4 NaN NaN
The data contain timestamps and unfiltered x and y coordinates. There are several missing samples at the onset of the recording.
Let’s do some pre-processing of the data. The kollaR function process_gaze has parameters determining 1) the maximum length of gaps in the data to be interpolated over in ms, and 2) the size of the moving average filter used to smooth the X and Y coordinates (in ms).
ggplot2 will show a warning if there are gaps in the data. For convenience, we will suppress these warnings.
gaze_processed <- process_gaze(sample.data.unprocessed, max_gap_ms = 75, filter_ms = 15)
#Plot the X coordinates of the processed data
new_plot <-ggplot(data = gaze_processed, aes(x = sample, y = x.raw))+geom_line()
suppressWarnings(
print(new_plot)
)
What effects do these filter settings have? Take a look at the plot. It can be useful to plot the unprocessed and processed data together. We will use the kollaR function filt_plot_temporal to do this. The function process_gaze will save the unprocessed x and y variables under the names x.unprocessed and y.unprocessed. Let’s plot the unprocessed and processed x coordinates during a 5000-millisecond interval which starts at 1000 ms and ends at 6000 ms in the recording. (Note that “raw” here indicates that these are data before fixation filtering, not data before any processing.)
plot <- filt_plot_temporal(gaze_processed, plot.window = c(1000, 6000), var1 = "x.unprocessed", var2 = "x.raw")
The red line (data after pre-processing) seems to follow the black line (data before pre-processing) closely. Note that the pre-processing procedure removes some samples that are found right before and after gaps in the data. Several sudden shifts in gaze position can be seen, probably reflecting saccades. We’ll change the settings and run the code again. This time, we increase the maximum length of gaps in the data accepted for interpolation and the length of the smoothing window.
gaze_processed2 <- process_gaze(sample.data.unprocessed, max_gap_ms = 200, filter_ms = 150)
#Plot the X coordinates of the processed data
plot <- filt_plot_temporal(gaze_processed2, plot.window = c(1000, 6000), var1 = "x.unprocessed",var2 ="x.raw")
A higher threshold for interpolation leads to fewer gaps in the data. Due to the longer smoothing window, some rapid changes in x either disappear or look like slower, continuous movements of the gaze position. This would likely result in a smaller number of detected fixations and saccades.
Now, let’s filter the processed data to detect fixations. We will use kollaR to apply three event detection algorithms: the I-VT filter, the I-DT filter, and the two-means clustering algorithm. All algorithms can be adapted by changing the parameters of the functions. By default, fixations that appear very close in time and space are merged. Set the input values distance.threshold and merge.ms.threshold to 0 to skip this.
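To make the two pre-processing parameters more concrete, here is a minimal base R sketch of gap interpolation and moving-average smoothing on a single coordinate. It is not kollaR's implementation: the helpers interpolate_gaps() and smooth_ma() are hypothetical, and measuring gap length as the time between the valid samples surrounding the gap is an assumption.

```r
# Hypothetical helpers (not part of kollaR) illustrating the two steps.

# Linearly interpolate runs of NA that span at most max_gap_ms.
interpolate_gaps <- function(x, t, max_gap_ms) {
  runs <- rle(is.na(x))
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  for (i in which(runs$values)) {
    before <- starts[i] - 1
    after <- ends[i] + 1
    # Gaps at the edges of the recording have no neighbor to interpolate from
    if (before < 1 || after > length(x)) next
    # Assumption: gap length = time between the surrounding valid samples
    if (t[after] - t[before] <= max_gap_ms) {
      idx <- starts[i]:ends[i]
      x[idx] <- approx(t[c(before, after)], x[c(before, after)], xout = t[idx])$y
    }
  }
  x
}

# Centered moving average spanning roughly filter_ms milliseconds.
smooth_ma <- function(x, t, filter_ms) {
  dt <- median(diff(t))
  n <- max(1, round(filter_ms / dt))
  if (n %% 2 == 0) n <- n + 1  # odd window length keeps the filter centered
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 2))
}

t <- seq(0, 90, by = 10)  # toy timestamps in ms (100 Hz)
x <- c(100, 102, NA, NA, 108, 110, NA, NA, NA, NA)

x_interp <- interpolate_gaps(x, t, max_gap_ms = 30)
x_smooth <- smooth_ma(x_interp, t, filter_ms = 30)
print(x_interp)
print(x_smooth)
```

In this toy example the short gap is filled, while the longer gap at the end of the recording stays missing. Samples whose smoothing window touches a missing value also become missing, which is one plausible reason why samples next to gaps can disappear after pre-processing.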
ivt <- ivt_filter(gaze_processed,velocity.threshold = 35, min.fixation.duration = 40, one_degree = 40, distance.threshold = 0.7, merge.ms.threshold = 75)
## Calculating saccades
## Calculating fixations
## Merging adjacent fixations
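The core idea of a velocity-threshold filter can be sketched in a few lines of base R. This is a simplified illustration, not the code inside ivt_filter(): the helper classify_ivt() is hypothetical, and it skips the minimum duration check and the merging step.

```r
# Hypothetical sketch of velocity-threshold classification (not kollaR code).
classify_ivt <- function(x, y, t_ms, velocity.threshold, one_degree) {
  # Distance between consecutive samples, converted from pixels to degrees
  dist_deg <- sqrt(diff(x)^2 + diff(y)^2) / one_degree
  # Velocity in degrees per second; the leading 0 gives one value per sample
  velocity <- c(0, dist_deg / (diff(t_ms) / 1000))
  ifelse(velocity > velocity.threshold, "saccade", "fixation")
}

t_ms <- 0:9 * 10  # toy data sampled at 100 Hz
x <- c(400, 401, 400, 402, 500, 600, 601, 600, 602, 601)
y <- rep(500, 10)

label <- classify_ivt(x, y, t_ms, velocity.threshold = 35, one_degree = 40)

# Collapse consecutive samples with the same label into events
runs <- rle(label)
events <- data.frame(event = runs$values, n.samples = runs$lengths)
print(events)
```

The toy trace contains one large jump in x, so the samples are grouped into a fixation, a saccade and a second fixation.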
Run a dispersion-based (I-DT) algorithm to detect fixations. Fixations must have a minimum duration of 40 milliseconds. All samples counted as belonging to a fixation should be within a 1 degree radius of the centroid of the fixation.
dispersion <- idt_filter(gaze_processed, dispersion.threshold = 1, min.duration = 40, one_degree = 40, distance.threshold = 0.7, merge.ms.threshold = 75)
## Merging adjacent fixations
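A dispersion-based filter can be sketched as a window that grows for as long as all samples stay within the allowed radius of their centroid. The function idt_sketch() below is a hypothetical, simplified illustration in base R. It assumes data without missing samples and is not the algorithm as implemented in idt_filter().

```r
# Hypothetical sketch of dispersion-based fixation detection (not kollaR code).
idt_sketch <- function(x, y, t_ms, dispersion.threshold, min.duration, one_degree) {
  # TRUE if all samples in idx lie within the radius (in degrees) of their centroid
  within_radius <- function(idx) {
    d <- sqrt((x[idx] - mean(x[idx]))^2 + (y[idx] - mean(y[idx]))^2) / one_degree
    all(d <= dispersion.threshold)
  }
  fixations <- list()
  n <- length(x)
  start <- 1
  while (start < n) {
    end <- start
    # Grow the window while every sample stays within the radius
    while (end < n && within_radius(start:(end + 1))) end <- end + 1
    if (t_ms[end] - t_ms[start] >= min.duration) {
      fixations[[length(fixations) + 1]] <- data.frame(
        x = mean(x[start:end]), y = mean(y[start:end]),
        onset = t_ms[start], offset = t_ms[end],
        duration = t_ms[end] - t_ms[start])
      start <- end + 1   # continue after the detected fixation
    } else {
      start <- start + 1 # too short: slide the window forward by one sample
    }
  }
  do.call(rbind, fixations)
}

t_ms <- 0:11 * 10  # toy data sampled at 100 Hz
x <- c(400, 402, 401, 399, 400, 401, 800, 801, 799, 800, 802, 801)
y <- rep(500, 12)

fix <- idt_sketch(x, y, t_ms, dispersion.threshold = 1, min.duration = 40,
                  one_degree = 40)
print(fix)
```

The toy data contain two stable gaze positions, and the sketch returns one fixation for each of them.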
Run a two-means clustering algorithm to detect fixations.
c2m <- cluster2m(gaze_processed, windowlength.ms = 200, distance.threshold = 0.7, min.fixation.duration = 40, weight.threshold = 2, downsampling.factors = c(10, 20), merge.ms.threshold = 75)
## Searching for fixations
## Calculating transition weights at original sampling rate
## Calculating fixation weights in downsampled data
## Downsampling by factor: 10
## Downsampling by factor: 20
## Merging adjacent fixations
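As a rough illustration of the clustering step only, the sketch below splits the samples in a single toy window into two clusters with base R's kmeans() and looks for the point where cluster membership changes. It is an assumption-laden simplification and not the code inside cluster2m(), which also computes transition weights and repeats the procedure on downsampled data, as the messages above indicate.

```r
set.seed(1)  # kmeans() uses random starting centers

# Toy window: gaze rests at one location, then jumps to another
x <- c(rnorm(10, mean = 400, sd = 2), rnorm(10, mean = 800, sd = 2))
y <- c(rnorm(10, mean = 500, sd = 2), rnorm(10, mean = 520, sd = 2))

# Split the samples in the window into two clusters
clusters <- kmeans(cbind(x, y), centers = 2, nstart = 10)$cluster

# A change in cluster membership between consecutive samples marks a
# candidate transition between two fixations
transitions <- which(diff(clusters) != 0)
print(transitions)
```

With clearly separated gaze positions there is a single transition, located at the sample where the gaze jumps.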
The output from these functions is a list which includes fixations, raw and filtered samples, and saccades (only the I-VT filter). To look at the fixations detected by the I-VT filter, run the code below. The data frame includes coordinates, duration, onset and offset (relative to the first sample in the input data) and two data quality measures: the proportion of missing samples within the fixation period and the root mean square deviation (RMSD) of the included samples from the center of the fixation.
head(ivt[["fixations"]])
## x y duration onset offset missing.samples rmsd firstline
## 1 352.3256 430.4392 318 1623 1941 0 0.06539270 1949
## 2 392.9814 465.8413 128 1979 2107 0 0.03676825 2376
## 3 352.5374 556.4229 326 2165 2491 0 0.04073959 2599
## 4 1498.7982 448.4513 193 2610 2803 0 0.04771203 3133
## 5 1506.7132 404.1415 250 2847 3097 0 0.05040472 3417
## 6 1554.7387 400.1972 287 3126 3413 0 0.04390898 3752
## lastline fixation.filter threshold
## 1 2331 ivt 35 deg.
## 2 2529 ivt 35 deg.
## 3 2990 ivt 35 deg.
## 4 3365 ivt 35 deg.
## 5 3717 ivt 35 deg.
## 6 4097 ivt 35 deg.
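The two data quality measures can be illustrated with a short base R sketch. The helper fixation_quality() is hypothetical, and expressing RMSD in degrees (via one_degree) is an assumption based on the magnitude of the values in the output above, not something confirmed by the package code.

```r
# Hypothetical helper (not kollaR code): quality measures for one fixation.
fixation_quality <- function(x, y, one_degree) {
  valid <- !is.na(x) & !is.na(y)
  # Center of the fixation, based on valid samples only
  cx <- mean(x[valid])
  cy <- mean(y[valid])
  # Distance of each valid sample from the center, in degrees (assumption)
  dist_deg <- sqrt((x[valid] - cx)^2 + (y[valid] - cy)^2) / one_degree
  c(missing.samples = mean(!valid),      # proportion of missing samples
    rmsd = sqrt(mean(dist_deg^2)))       # root mean square deviation
}

# Toy samples belonging to one fixation, with one missing sample
x <- c(400, 404, NA, 396, 400)
y <- c(500, 500, NA, 500, 500)

q <- fixation_quality(x, y, one_degree = 40)
print(q)
```

Lower RMSD values indicate that the samples are tightly clustered around the center of the fixation.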
To understand what the fixation filter does to your data, it’s helpful to visualize raw and filtered data together. This illustrates which periods the filter identifies as fixations and which it does not. We can plot the raw and filtered x coordinates against sample (that is, time) using the function filt_plot_temporal(). The parameter ‘plot.window’ determines the onset and offset of the interval you want to plot in ms. By default, we plot the X coordinates.
Here, we plot data from the I-VT filter.
new_plot <- filt_plot_temporal(ivt[["filt.gaze"]], plot.window = c(1000, 3000), var1 = "x.raw", var2 = "x")
To compare fixation filters, we will plot the same interval in the data using the I-DT (dispersion-based) filter.
new_plot <- filt_plot_temporal(dispersion[["filt.gaze"]], plot.window = c(1000, 3000), var1 = "x.raw", var2= "x")
Create a plot of the fixations and raw samples in 2D space instead. Here, we will use the kollaR function filt_plot_2d. This function uses two data frames: filtered.data is a data frame with fixations from one or more fixation filter algorithms, and raw.data is a sample-by-sample data frame with x and y coordinates before fixation filtering. It is possible to change the radius of the plotted fixations. Here, we set it to 40 pixels, which is approximately 1 degree of the visual field on the screen where stimuli were presented.
filt_plot_2d(filtered.data = ivt[["fixations"]], raw.data = ivt[["filt.gaze"]], plot.window = c(1000, 5000), fixation.radius = 40)
## Warning: Removed 141 rows containing missing values or values outside the scale range
## (`geom_point()`).
Now, we will compare the output from three different fixation detection algorithms. The first step is to add all detected fixations in a combined data frame. The second step is to visualize them with the kollaR function filt_plot_2d. This function needs a fixation data frame and sample-by-sample raw data (in this case included in the data frame ivt[["filt.gaze"]]).
combined.filters = rbind(
ivt[["fixations"]], dispersion[["fixations"]], c2m[["fixations"]])
new_plot <- filt_plot_2d(raw.data = ivt[["filt.gaze"]],
filtered.data = combined.filters,
plot.window = c(2000, 3000), xres = 1920, yres = 1080, fixation.radius = 40)
It’s reassuring that all three algorithms give a similar result. What happens if we change the settings of the I-VT algorithm? The following example runs ivt_filter() with two extreme (and most likely invalid) settings. A velocity threshold of 10 degrees per second will identify relatively small increases in eye movement velocity as saccades. In contrast, a velocity threshold of 75 degrees per second is conservative and may miss slow saccades. We will run the function ‘ivt_filter’ inside the R command suppressMessages(). This prevents messages about the progress of the functions from being printed to the prompt.
suppressMessages(
ivt_high <- ivt_filter(gaze_processed,velocity.threshold = 75, min.fixation.duration = 40, one_degree = 40)
)
suppressMessages(
ivt_low <- ivt_filter(gaze_processed,velocity.threshold = 10, min.fixation.duration = 40, one_degree = 40)
)
combined.filters = rbind(
ivt_high[["fixations"]], ivt_low[["fixations"]])
new_plot <- filt_plot_2d(raw.data = ivt[["filt.gaze"]],
filtered.data = combined.filters,
plot.window = c(1000, 3000), xres = 1920, yres = 1080, fixation.radius = 40)
This time, there are clear differences. As expected, the lower threshold identifies a higher number of fixations. Let’s have a look at the duration of the fixations detected by the I-VT filter using the default settings.
ggplot(data = ivt[["fixations"]], aes(x = 1, y = duration)) + geom_violin(fill = "darkred", color = "darkgrey", alpha = 0.2) + geom_boxplot(width = 0.5) + geom_jitter()
…and by the I-DT filter
ggplot(data = dispersion[["fixations"]], aes(x = 1, y = duration)) + geom_violin(fill = "darkred", color = "darkgrey", alpha = 0.2) + geom_boxplot(width = 0.5) + geom_jitter()
Let’s look at data from multiple participants and compare the two fixation filters. Here, we load a file with fixations detected using default settings from the I-VT and I-DT filters.
We will plot the RMSD values from both filters. It is common in eye tracking studies that fixation filters that work well for certain groups of participants (for example, neurotypical adults) do not work well for others (for example, young autistic children). Here, we will use the output from ivt_filter() and idt_filter() and plot fixation RMSD for seven participants using the function plot_filter_results().
plot_filter_results(sample.data.fixations)
It seems like the I-VT filter produces a small number of outliers. This makes it difficult to interpret the plots. Plot data in a more restricted range instead: Are there any differences between the two filters?
plot_filter_results(dplyr::filter(sample.data.fixations, rmsd <1))
After removing outlier values, the I-VT filter seems to give lower RMSD values (i.e., to work better) for most participants.
After deciding on a fixation filter, it’s useful to visualize the data. First, let’s create a plot of the scanpath for one participant using the function static_plot().
The kollaR function ‘static_plot’ shows fixations within the interval from plot.onset to plot.offset, where both limits refer to values in the variable “onset”. The fixation data frame must include the variables x, y, onset and offset. Here, we will plot data for the participant “id3” and include only fixations detected with the I-VT algorithm (as implemented in the function ivt_filter()).
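A numerical summary can complement the visual comparison. The sketch below computes the median RMSD per participant and filter with base R's aggregate(). The data frame is a made-up stand-in with the column names id, fixation.filter and rmsd; the values and the label "idt" are assumptions for illustration only.

```r
# Toy stand-in for a fixation data frame; all values are made up
fixations <- data.frame(
  id = rep(c("id1", "id2"), each = 4),
  fixation.filter = rep(c("ivt", "ivt", "idt", "idt"), times = 2),
  rmsd = c(0.04, 0.06, 0.08, 0.10, 0.05, 0.07, 0.05, 0.09)
)

# Median RMSD for each combination of participant and fixation filter
rmsd_summary <- aggregate(rmsd ~ id + fixation.filter, data = fixations,
                          FUN = median)
print(rmsd_summary)
```

The median is used here because it is less sensitive than the mean to a small number of extreme values.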
ivt.fixations <- dplyr::filter(sample.data.fixations, fixation.filter =="ivt")
data.by.id <- dplyr::filter(ivt.fixations, id == "id3")
static <- static_plot(data.by.id, plot.onset = 1, plot.offset = 8000, background.images = NA)
Let’s plot data from all participants. Does it look reasonable?
static <- static_plot(ivt.fixations, plot.onset = 1, plot.offset = 8000, background.images = NA)
The plot will be easier to understand if we can plot the fixations on top of the images participants were watching. We can do this in static_plot() by adding a data frame with background images. This data frame must include the image names (with full file paths if the images are not located in R’s working directory) as well as the position of the images.
This type of plot is useful for presenting results, but also for validating our analysis methods. For example, if the plot indicates that participants did not look at the actual stimuli but at the side of them, there is a high chance that something went wrong in the analysis.
background.images <- data.frame(path = c("klee2.jpg", "angelus.jpg"),
min.x = c(34, 1186),
min.y = c(190, 190),
max.x = c(734, 1886),
max.y = c(890, 890))
static <- static_plot(ivt.fixations, plot.onset = 1, plot.offset = 8000, background.images = background.images)
Eye tracking data can be visualized in dynamic plots which show how gaze behaviors unfold in real time. The kollaR function animated_fixation_plot() creates a .gif animation of the eye movements of one or multiple participants. As in static_plot(), it is possible to add background images and select an interval to visualize. The resulting animation can be stored in a file and used in presentations. If background images are included, their onset and offset should be specified in the data frame.
background.images <- data.frame(path = c("klee2.jpg", "angelus.jpg"),
min.x = c(34, 1186),
min.y = c(190, 190),
max.x = c(734, 1886),
max.y = c(890, 890),
onset = c(1, 1),
offset = c(8000, 8000))
animated_fixation_plot(ivt.fixations, plot.onset = 2000, plot.offset = 8000, background.images = background.images, save.gif = FALSE, n.loops = 0, resolution.scaling = 0.5, gif.dpi = 150, show.progress = FALSE)
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
Once we have visualized the data and validated the fixation filter, we will want to know whether fixations fall into the areas of interest (AOIs) we define in our study. This can be done using the kollaR function aoi_test().
aoi_test() needs two data frames: one for the gaze data and one for the AOIs. The gaze data are fixations in this example, but the function can be used to analyze saccadic latency as well. Here, we define two AOIs called “left” and “right”:
aois <- data.frame(
name =c("left", "right"),
x0 = c(34, 1186),
y0 = c(190, 190),
x1 = c(734, 1886),
y1 = c(890, 890),
type = c("rect", "rect"))
Here, we assumed that both AOIs are rectangular. Change the variable ‘type’ to “circle” to test whether fixations fall into a circular/elliptical AOI. The kollaR function draw_aois() can be used to draw rectangular or circular AOIs on a stimulus image.
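The difference between the two AOI types can be illustrated with a small hit test in base R. The helper in_aoi() is hypothetical and not part of kollaR. In particular, treating a "circle" AOI as the ellipse inscribed in the bounding box given by x0, y0, x1 and y1 is an assumption about how such an AOI might be defined.

```r
# Hypothetical helper (not kollaR's aoi_test()): is a point inside an AOI?
in_aoi <- function(x, y, aoi) {
  if (aoi$type == "rect") {
    x >= aoi$x0 & x <= aoi$x1 & y >= aoi$y0 & y <= aoi$y1
  } else if (aoi$type == "circle") {
    # Assumption: ellipse inscribed in the bounding box (x0, y0) to (x1, y1)
    cx <- (aoi$x0 + aoi$x1) / 2
    cy <- (aoi$y0 + aoi$y1) / 2
    rx <- (aoi$x1 - aoi$x0) / 2
    ry <- (aoi$y1 - aoi$y0) / 2
    ((x - cx) / rx)^2 + ((y - cy) / ry)^2 <= 1
  } else {
    stop("Unknown AOI type: ", aoi$type)
  }
}

left_rect <- list(x0 = 34, y0 = 190, x1 = 734, y1 = 890, type = "rect")
left_circle <- list(x0 = 34, y0 = 190, x1 = 734, y1 = 890, type = "circle")

# A fixation close to the corner of the bounding box
corner_in_rect <- in_aoi(50, 200, left_rect)
corner_in_circle <- in_aoi(50, 200, left_circle)
# A fixation at the center of the bounding box
center_in_circle <- in_aoi(384, 540, left_circle)
print(c(corner_in_rect, corner_in_circle, center_in_circle))
```

A fixation near the corner of the bounding box counts as inside the rectangular AOI but outside the elliptical one, so the choice of AOI type can change the results for fixations near the edges of a stimulus.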
Summarize fixation time and number of fixations in each AOI with the function aoi_test. Start with one participant and see the results:
data.by.id <- dplyr::filter(ivt.fixations, id =="id4")
data <- aoi_test(data.by.id, aois)
head(data)
## total.duration n.fix latency aoi
## 1 2599 9 1587 left
## 2 4371 9 3200 right
The function will give you the total fixation time in the AOI, the number of detected fixations and the latency to the first fixation detected in the AOI. Send subsets of the fixation data frame to the function if you are interested in specific intervals of the data (for example, one trial).
Run the analysis for all participants and combine them in a data frame called dataset:
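Selecting the fixations that belong to one interval can be done with ordinary data frame subsetting. The sketch below uses a made-up fixation data frame and hypothetical trial boundaries. Whether onsets should be re-expressed relative to the start of the trial before computing latencies depends on the analysis, so that step is shown as optional.

```r
# Toy fixation data frame; all values are made up
fixations <- data.frame(
  x = c(300, 1500, 350, 1600),
  y = c(500, 480, 600, 500),
  onset = c(200, 1200, 2500, 4300),
  offset = c(500, 1600, 2900, 4700)
)
fixations$duration <- fixations$offset - fixations$onset

# Hypothetical trial boundaries in ms
trial <- c(start = 1000, end = 4000)

# Keep fixations that start within the trial
in_trial <- fixations$onset >= trial[["start"]] & fixations$onset < trial[["end"]]
trial_fixations <- fixations[in_trial, ]

# Optional: express onset and offset relative to the start of the trial
trial_fixations$onset <- trial_fixations$onset - trial[["start"]]
trial_fixations$offset <- trial_fixations$offset - trial[["start"]]
print(trial_fixations)
```

The resulting data frame has the same columns as the original and could be passed on to the AOI analysis in place of the full data.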
dataset <- data.frame()
for (this.id in unique(ivt.fixations$id)) {
dat <- aoi_test(dplyr::filter(ivt.fixations, id == this.id), aois)
dat$id <- this.id
dataset <- rbind(dataset, dat)
}
head(dataset)
## total.duration n.fix latency aoi id
## 1 2698 10 1459 left id1
## 2 4297 12 2588 right id1
## 3 3042 13 1689 left id2
## 4 3018 10 1556 right id2
## 5 2907 10 2987 left id3
## 6 2579 8 1856 right id3
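Growing a data frame with rbind() inside a loop works well for a handful of participants. For larger data sets, a common base R alternative is to split the data by participant, summarize each part with lapply() and bind the results once at the end. The sketch below uses made-up fixations and a hypothetical stand-in function, aoi_summary_sketch(), so that it runs without kollaR. The stand-in handles rectangular AOIs only and is not the implementation of aoi_test().

```r
# Hypothetical stand-in for aoi_test() (rectangular AOIs only)
aoi_summary_sketch <- function(fixations, aois) {
  rows <- lapply(seq_len(nrow(aois)), function(i) {
    a <- aois[i, ]
    hits <- fixations[fixations$x >= a$x0 & fixations$x <= a$x1 &
                        fixations$y >= a$y0 & fixations$y <= a$y1, ]
    data.frame(total.duration = sum(hits$duration),
               n.fix = nrow(hits),
               latency = if (nrow(hits) > 0) min(hits$onset) else NA_real_,
               aoi = a$name)
  })
  do.call(rbind, rows)
}

# Toy fixations for two participants; all values are made up
fixations <- data.frame(
  id = c("id1", "id1", "id1", "id2", "id2"),
  x = c(300, 1500, 350, 1600, 960),
  y = c(500, 480, 600, 500, 100),
  onset = c(200, 600, 1000, 150, 700),
  duration = c(300, 350, 400, 500, 200)
)
aois <- data.frame(name = c("left", "right"),
                   x0 = c(34, 1186), y0 = c(190, 190),
                   x1 = c(734, 1886), y1 = c(890, 890))

# Split by participant, summarize each part, then bind once at the end
per_id <- lapply(split(fixations, fixations$id), aoi_summary_sketch, aois = aois)
dataset <- do.call(rbind, Map(function(d, id) cbind(d, id = id),
                              per_id, names(per_id)))
rownames(dataset) <- NULL
print(dataset)
```

In the toy data, the second participant never looks at the left AOI, so the corresponding row has zero fixations and a missing latency.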
Plot fixation time as a function of AOI. What do the data look like? Any outliers?
ggplot(data= dataset, aes(x = aoi, y = total.duration, fill = aoi))+ geom_violin()+geom_boxplot(width = 0.2)+geom_jitter(width = 0.3)
It can be useful to determine whether fixations were outside rather than inside a specific AOI. If aoi_test is run with the parameter outside = TRUE, the output will summarize the number of fixations, total duration, and latency to a first fixation outside each AOI.
dataset <- data.frame()
for (this.id in unique(ivt.fixations$id)) {
dat <- aoi_test(dplyr::filter(ivt.fixations, id == this.id), aois, outside = TRUE)
dat$id <- this.id
dataset <- rbind(dataset, dat)
}
head(dataset)
## total.duration n.fix latency aoi id
## 1 4297 12 2588 left id1
## 2 2698 10 1459 right id1
## 3 4827 16 327 left id2
## 4 4851 19 327 right id2
## 5 2579 8 1856 left id3
## 6 2907 10 2987 right id3
Saccades are the rapid eye movements occurring in between fixations. They have several properties which are interesting in eye tracking research. The kollaR function ivt_filter() detects saccades and returns their onset, offset, amplitude and peak velocity. Here, we will plot the distribution of saccade amplitude in an example data set.
ggplot(data = sample.data.saccades, aes(x = amplitude)) + geom_histogram(bins = 15, fill = "darkred", color = "darkgrey")
While most saccades are shorter than 10 degrees, there is also a small number of very long saccades. Are these valid? A careful analysis of the pre-processing and filter settings can help us understand whether these are truly saccades or could be artifacts in the data.
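One simple first check is to convert the distance between the start and end point of each saccade to degrees and flag unusually long saccades for closer inspection. The sketch below is a base R illustration with made-up coordinates. The column names x.onset, y.onset, x.offset and y.offset and the helper saccade_amplitude() are hypothetical, and the 10 degree cutoff is only taken from the observation above.

```r
# Hypothetical helper: straight-line saccade amplitude in degrees
saccade_amplitude <- function(x.onset, y.onset, x.offset, y.offset, one_degree) {
  sqrt((x.offset - x.onset)^2 + (y.offset - y.onset)^2) / one_degree
}

# Toy saccades in pixel coordinates; all values are made up
saccades <- data.frame(
  x.onset = c(400, 400, 100),
  y.onset = c(500, 500, 300),
  x.offset = c(520, 400, 1700),
  y.offset = c(500, 660, 300)
)

saccades$amplitude <- with(saccades,
  saccade_amplitude(x.onset, y.onset, x.offset, y.offset, one_degree = 40))

# Flag saccades longer than 10 degrees for manual inspection
saccades$long <- saccades$amplitude > 10
print(saccades)
```

With 40 pixels per degree, a 1920 pixel wide screen spans 48 degrees, so a 40 degree saccade would cross most of the screen. Flagged saccades like this one are good candidates for a closer look at the raw samples.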
Run the I-VT filter again. This time, we will save the velocity profiles:
ivt <- ivt_filter(gaze_processed,velocity.threshold = 35, min.fixation.duration = 40, one_degree = 40, save.velocity.profiles = TRUE)
## Calculating saccades
## Calculating fixations
## Merging adjacent fixations
Saccades have a characteristic profile of increase and decrease in eye movement velocity. Peak velocity is associated with arousal, and as such is of interest to many researchers. Use the kollaR function plot_velocity_profiles() to plot saccade velocity.
Plot the velocity profiles of all detected saccades with a higher amplitude than 2 degrees:
velocity_plot <- plot_velocity_profiles(dplyr::filter(ivt[["saccades"]], amplitude > 2))